Machine learning algorithm to predict postoperative bleeding complications after lateral decubitus percutaneous nephrolithotomy

Bleeding is a serious complication following percutaneous nephrolithotomy (PCNL). This study establishes a predictive model based on machine learning algorithms to forecast postoperative bleeding complications in patients with renal and upper ureteral stones undergoing lateral decubitus PCNL. We retrospectively collected data from 356 patients with renal or upper ureteral stones who underwent lateral decubitus PCNL in the Department of Urology at Peking University First Hospital-Miyun Hospital between January 2015 and August 2022. Among them, 290 patients had complete baseline data and were randomly divided into a training group (n = 232) and a test group (n = 58) in an 8:2 ratio. Predictive models were constructed using Logistic Regression, Random Forest, and Extreme Gradient Boosting (XGBoost). The performance of each model was evaluated using Accuracy, Precision, F1 score, Receiver Operating Characteristic curves, and Area Under the Curve (AUC). Among the 290 patients, 35 (12.07%) experienced postoperative bleeding complications after lateral decubitus PCNL. With postoperative bleeding as the outcome, the Logistic Regression model achieved an accuracy of 73.2%, an AUC of 0.605, and an F1 score of 0.732. The Random Forest model achieved an accuracy of 74.5%, an AUC of 0.679, and an F1 score of 0.732. The XGBoost model achieved an accuracy of 68.3%, an AUC of 0.513, and an F1 score of 0.644. The predictive model for postoperative bleeding after lateral decubitus PCNL, established based on machine learning algorithms, is reasonably accurate. It can be used to predict postoperative bleeding complications, aiding urologists in making appropriate treatment decisions.


Introduction
[3] They can lead to urinary tract infections, urinary obstruction, hematuria, and impaired renal function. [4,5] Without timely or conservative treatment, kidney stones can progress to renal failure or even death. [6] Therefore, active intervention is essential for managing kidney stones. Common treatment modalities for stones include Extracorporeal Shock Wave Lithotripsy, Ureteroscopy, Percutaneous Nephrolithotomy (PCNL), and occasionally open surgery. [7] Although PCNL has become a mature surgical technique in recent years, complications remain common, with approximately 23.3% of patients experiencing postoperative complications. [11] Bleeding is a serious complication that requires prompt control and management. While conservative methods are often sufficient to control most post-PCNL bleeding, a subset of patients with severe bleeding may require surgical interventions such as angiographic embolization. [12] Moreover, previous studies have found that postoperative bleeding can lead to transfusions, with transfusion rates reaching up to 55%. [13] Therefore, the establishment of a predictive model for post-PCNL bleeding holds clinical significance, providing guidance for clinicians in diagnosis and treatment.
Machine learning methods, developed from branches of statistics, computer science, and artificial intelligence, enable us to uncover complex relationships and patterns in large databases that classical statistical methods may not detect, resulting in improved and more useful predictive models. [14,15] In recent years, machine learning predictive models for urological diseases have been widely reported. [16,17] However, there is currently no reported machine learning predictive model for postoperative bleeding after lateral decubitus PCNL. This study aims to construct a machine learning predictive model for postoperative bleeding complications after lateral decubitus PCNL.

Informed consent was waived due to the retrospective design of the study.
The authors have no funding or conflicts of interest to disclose.
The datasets generated and/or analyzed during the current study are not publicly available, but are available from the corresponding author on reasonable request.

Study population
We retrospectively collected data from 356 patients with renal or upper ureteral stones who underwent lateral decubitus PCNL in the Department of Urology at Peking University First Hospital-Miyun Hospital between January 2015 and August 2022. Among them, 290 patients had complete baseline data. Using 26 clinical parameters such as gender, age, and stone size as variables, we constructed a machine learning predictive model with postoperative bleeding as the outcome.
The endpoint of this study was postoperative bleeding, defined as a decrease in hemoglobin of at least 20 g/L between the preoperative measurement and the first postoperative day. A decrease of <20 g/L was not considered a postoperative bleeding complication. The decrease in hemoglobin (g/L) was calculated as: Decrease = Preoperative hemoglobin - Hemoglobin on the first postoperative day.
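The endpoint definition above can be sketched in a few lines of Python. This is only an illustration: the class name, function names, and the three hemoglobin pairs are hypothetical and do not come from the study data; only the 20 g/L threshold and the subtraction are taken from the text.

```python
from dataclasses import dataclass

# Threshold from the study definition: a drop of >= 20 g/L counts as bleeding.
BLEEDING_THRESHOLD_G_PER_L = 20.0


@dataclass
class HemoglobinRecord:
    """Hypothetical container for one patient's hemoglobin values (g/L)."""
    preoperative_g_per_l: float
    postoperative_day1_g_per_l: float


def hemoglobin_decrease(record: HemoglobinRecord) -> float:
    """Decrease = preoperative hemoglobin - hemoglobin on postoperative day 1."""
    return record.preoperative_g_per_l - record.postoperative_day1_g_per_l


def has_postoperative_bleeding(record: HemoglobinRecord) -> bool:
    """Label the outcome: True when the decrease reaches the 20 g/L threshold."""
    return hemoglobin_decrease(record) >= BLEEDING_THRESHOLD_G_PER_L


# Made-up example values, not patient data.
examples = [
    HemoglobinRecord(142.0, 118.0),  # decrease of 24 g/L
    HemoglobinRecord(135.0, 121.0),  # decrease of 14 g/L
    HemoglobinRecord(150.0, 130.0),  # decrease of exactly 20 g/L
]
labels = [int(has_postoperative_bleeding(r)) for r in examples]
print(labels)
```

Note that a decrease of exactly 20 g/L is labeled as bleeding, matching the "greater than or equal to" wording of the definition.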
Patients were included based on the following criteria: confirmation of renal or upper ureteral stones by ultrasound, intravenous pyelography, or urologic system CT before surgery; lateral decubitus position during PCNL; and complete clinical data.
Exclusion criteria were: abnormal coagulation function; cardiorespiratory dysfunction precluding surgery; and inability to cooperate with the study.
Patients meeting the inclusion criteria were randomly divided into a training group (n = 232) and a test group (n = 58) in an 8:2 ratio. The Synthetic Minority Over-sampling Technique (SMOTE) algorithm was applied to augment the dataset, addressing data imbalance issues. [18] A machine learning algorithm was employed to construct a predictive model for postoperative bleeding after lateral decubitus PCNL using the training group, and the accuracy of the model was tested using the test group. The process flow is illustrated in Figure 1.
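The split and oversampling steps can be illustrated with a minimal, standard-library-only sketch. Several things here are assumptions rather than details from the study: the text does not say whether SMOTE was applied before or after the split (the sketch oversamples the training group only), the neighbour count k = 5 and the seeds are arbitrary, the two numeric features are randomly generated placeholders, and the function names are hypothetical. The SMOTE function is a simplified version of the idea (interpolating between a minority sample and one of its nearest minority neighbours), not the implementation the authors used.

```python
import random


def train_test_split_8_2(rows, seed=0):
    """Shuffle the rows and split them 80/20 into training and test groups."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_train = round(0.8 * len(rows))
    train = [rows[i] for i in indices[:n_train]]
    test = [rows[i] for i in indices[n_train:]]
    return train, test


def smote(minority, n_new, k=5, seed=0):
    """Simplified SMOTE: each synthetic point lies on the segment between a
    minority sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Placeholder data: 290 rows, 35 labeled 1 (bleeding), two random features.
data_rng = random.Random(42)
rows = [([data_rng.gauss(0, 1), data_rng.gauss(0, 1)], 1 if i < 35 else 0)
        for i in range(290)]

train, test = train_test_split_8_2(rows, seed=42)
minority = [x for x, y in train if y == 1]
majority = [x for x, y in train if y == 0]

# Oversample the minority class in the training group until classes balance.
synthetic = smote(minority, n_new=len(majority) - len(minority), k=5, seed=42)
balanced_train = train + [(x, 1) for x in synthetic]
print(len(train), len(test), len(balanced_train))
```

Oversampling only the training group keeps synthetic samples out of the test group, so the test metrics reflect real patients; this is a common design choice, not something the text confirms.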
This study was conducted in accordance with the principles of the Declaration of Helsinki (as revised in 2013) and received approval from the Ethics Committee of Peking University First Hospital-Miyun Hospital. Informed consent for this retrospective analysis was waived by the committee.

Study variables
The variables included in the model construction for this study are as follows: Gender, Age, Body mass index (BMI), Hypertension, Diabetes mellitus, Coronary heart disease, Lung diseases, Brain diseases, Spinal deformity, Lesion side, Stone location, Stone size, Multiple stones, Puncture site, Channel type, Number of channels, Operation time, Intraoperative blood loss, Preoperative systolic blood pressure (SBP), Preoperative diastolic blood pressure (DBP), Intraoperative SBP, Intraoperative DBP, Preoperative heart rate, Intraoperative heart rate, Mode of anesthesia, Stone co-infection.

Machine learning analysis
Three models were constructed: a Logistic Regression model, a Random Forest model, and an Extreme Gradient Boosting (XGBoost) model. Logistic Regression is a traditional model commonly used in many studies. Random Forest is an ensemble learning algorithm, used here as a classifier for a binary outcome. It builds a forest of decision trees through bootstrap aggregation of samples and random selection of features. Random Forest can easily assess the importance or contribution of variables to the model. [19,20] The XGBoost model employs classification trees as weak learners and learns a binary logistic objective function. The boosting method iteratively fits each new weak classifier (decision tree) to the residuals of the previous model, achieving higher predictive accuracy through multiple iterations and forming a strong classifier. [21] Figure 2A and B show part of the tree structure of the Random Forest model and the top 10 most important variables of the XGBoost model, respectively. The performance of each predictive model was evaluated using Accuracy, Precision, F1 score, Receiver Operating Characteristic curves, and Area Under the Curve (AUC).
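The evaluation metrics named above can be written out in plain Python as follows. This is a sketch for illustration only: the labels, predicted scores, and the 0.5 decision threshold are made up, and the function names are hypothetical. The AUC is computed from its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and scores (1 = postoperative bleeding), not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
auc = roc_auc(y_true, scores)
print(tp, fp, fn, tn, acc, f1, auc)
```

Accuracy, precision, and F1 depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric can disagree for the same model.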

Statistical analysis
Statistical analysis was performed using SPSS version 22.0. Normally distributed continuous data are expressed as mean ± standard deviation, while skewed data are described using median (range). For continuous variables, t-tests were used for normally distributed variables, and the Mann-Whitney U test was employed for non-normally distributed variables. Chi-square tests and the Fisher exact test were used to analyze categorical variables.
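As an illustration of the categorical comparison, the Pearson chi-square test for a 2 × 2 table can be worked through by hand. The sketch below is not the SPSS procedure used in the study: it omits the continuity correction, the function name is hypothetical, and the table counts are invented for the example. For 1 degree of freedom, the P value equals erfc(sqrt(chi2 / 2)).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], with the P value for 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: rows = bleeding / no bleeding, columns = factor present / absent.
chi2, p_value = chi_square_2x2(10, 25, 40, 215)
print(round(chi2, 3), round(p_value, 3))
```

When any expected cell count is small (a common rule of thumb is below 5), the Fisher exact test mentioned above is preferred over the chi-square approximation.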

Patient characteristics
The baseline characteristics of the patients are presented in Table 1. The average age of the patients was 51.72 ± 13.11 years. Of the total, 179 (61.72%) were male, and 111 (38.28%) were female. The mean stone size was 2.86 ± 0.95 cm. The average surgical duration was 115.31 ± 49.67 minutes. The mean intraoperative blood loss was 19.25 ± 22.19 mL. Thirty-five patients experienced postoperative bleeding complications, resulting in an incidence rate of 12.07%.

Efficiency of machine learning models in predicting postoperative bleeding after lateral decubitus PCNL
The 290 patients were randomly divided into 2 datasets, with 80% assigned to the training group (n = 232) and 20% to the test group (n = 58). There were no statistically significant differences in any indicator between the 2 groups (P > .05) (Table 2). The accuracy of the Logistic Regression model was 73.2%, with an AUC of 0.605 and an F1 score of 0.732. The Random Forest model achieved an accuracy of 74.5%, an AUC of 0.679, and an F1 score of 0.732. The XGBoost model showed an accuracy of 68.3%, an AUC of 0.513, and an F1 score of 0.644 (Fig. 2C and Table 3). The generated confusion matrix indicated that the Random Forest model had the best predictive performance among all models (Fig. 2D).

Discussion
PCNL is a safe and effective surgical procedure used for the removal of large, complex, and multiple kidney stones. [22] However, life-threatening bleeding complications may occur during and after PCNL. Treatment options for bleeding resulting from PCNL, such as placing larger nephrostomy tubes, clamping the nephrostomy tube, balloon tamponade, and vascular embolization, have demonstrated good efficacy. [23,24] Early intervention, such as placing larger nephrostomy tubes in high-risk patients, can effectively prevent postoperative bleeding. Therefore, a reliable postoperative bleeding prediction model is beneficial for clinicians in making diagnostic and therapeutic decisions.
Previous research primarily focused on identifying factors influencing postoperative bleeding, providing references for clinicians to identify high-risk patients prone to postoperative bleeding. In the study by Tolga Akman et al, [25] diabetes, surgical time, number of surgeries, and stone type were found to be correlated with a decrease in hemoglobin levels. Srivastava et al [26] suggested that stone size is the sole important factor predicting post-PCNL bleeding. In the research by Jeong Kuk Lee et al, [27] BMI, stone size, stone location, surgical time, and preoperative renal pelvic dilation were identified as predictive factors for severe bleeding during PCNL. In our study, the Random Forest model identified multiple stones, stone size, preoperative heart rate, BMI, and intraoperative DBP as the top 5 influential variables. The XGBoost model highlighted age, BMI, stone size, multiple stones, and intraoperative SBP as the top 5 important features. The identification of stone size, multiple stones, and BMI as significant features aligns well with previous studies and is consistent with clinical experience.
Currently, there is only one report on predicting post-PCNL bleeding through multivariate regression analysis using traditional statistical methods. To our knowledge, this study represents the first application of a machine learning-based predictive model for post-PCNL bleeding. Traditional statistical methods, such as regression analysis, may not extract features from data as effectively as machine learning algorithms, which can put them at a disadvantage in model construction with the same sample size. The study by Giorgio Mazzon et al, [28] a large prospective study of 1980 patients, identified risk factors such as age (P = .041), BMI (P = .018), maximum stone diameter (P = .001), preoperative hemoglobin (P = .005), and diabetes. The predictive model constructed based on these risk factors had an AUC of 0.73. In our study, the Random Forest model achieved an accuracy of 74.5%, with an AUC of 0.679. The slightly lower AUC in our model compared to that of Giorgio Mazzon et al may be due to differences in sample size. Further increasing the number of patients in the study for additional training is expected to enhance the model performance.
The application of machine learning algorithms is a future trend in the integration of medicine and technology, given their strong self-learning capability and their continuous improvement and optimization from data. While the models constructed in this study have yet to be widely applied in clinical settings, they represent a future direction. The developed software can directly connect to electronic medical record systems, incorporating real-time data for model training and performance enhancement. Urologists can receive timely postoperative predictions from machine learning models, providing valuable treatment references.
This study has some limitations. Firstly, the models were built on single-center data, introducing potential bias. Secondly, compared to machine learning models in other fields, the sample size used for model training and testing in this study is still limited. Additionally, machine learning models are often referred to as black-box models, making it challenging to interpret specific statistical patterns, which limits their generalizability in the medical field.

Conclusion
We have successfully developed a relatively accurate predictive model for postoperative bleeding following PCNL based on machine learning algorithms. This model serves to forecast complications related to postoperative bleeding, providing valuable assistance to urologists in making timely and appropriate treatment decisions in the early stages.

Figure 1. Flowchart of the study.

Figure 2. (A) Visualization of part of the leaves of the Random Forest model. Note that the visualization shows only part of the leaves or principles of the decision trees and does not represent the entire model. (B) Top 10 important features of the XGBoost model. (C) Receiver operating characteristic curves of the 3 models. (D) Confusion matrices of the 3 models. XGBoost = extreme gradient boosting.

Table 1
Basic characteristics of the patients.

Table 2
Basic characteristics of the patients of postoperative hemorrhage.